Mining Semi-structured Data
Abstract
Discovering knowledge from XML documents according to both structure and content features has become challenging, owing to the growing number of application contexts in which handling both the structural and the content information in XML data is essential. The challenge is therefore to find a hierarchical structure that combines the data levels with their representative structures. In this work, we rely on Formal Concept Analysis-based views to index and query both content and structure. We evaluate the resulting structure in a querying process that searches for answers to user queries.
Similar resources
Survey on Mining in Semi-Structured Data
Emerging technologies for semi-structured data have attracted wide attention in networks, e-commerce, information retrieval and databases. In these applications, the data are modeled not as static collections but as transient data streams, where the data source is an unbounded stream of individual data items. It is becoming increasingly popular to send heterogeneous and ill-structured data throu...
Online Algorithms for Mining Semi-structured Data Stream
In this paper, we study an online data mining problem from streams of semi-structured data such as XML data. Modeling semi-structured data and patterns as labeled ordered trees, we present an online algorithm StreamT that receives fragments of unseen, possibly infinite semi-structured data in the document order through a data stream, and can return the current set of frequent patterns immediat...
A Robust System Architecture for Mining Semi-Structured Data
The value of extracting knowledge from semi-structured data is readily apparent with the explosion of the WWW and the advent of digital libraries. This paper proposes a versatile system architecture for text mining that maintains structured data components in a relational database and unstructured concepts in a concept library. After a detailed explanation of our system architecture, we briefly...
MISTA Mining in Semi-Structured Data
Semi-structured data arises when the source or the environment does not impose a rigid structure on the data and when data is combined from several heterogeneous sources. Examples include the World Wide Web and bioinformatics databases; the situation also occurs in data warehousing. Unlike (nearly) unstructured raw data such as images and sound, semi-structured data has some structure: objects sh...
Efficient Algorithms for Discovering Frequent and Maximal Substructures from Large Semistructured Data
In this paper, we review recent advances in efficient algorithms for semi-structured data mining, that is, discovery of rules and patterns from structured data such as sets, sequences, trees, and graphs. After introducing basic definitions and problems, we present efficient algorithms for frequent and maximal pattern mining for classes of sets, sequences, and trees. In particular, we explain ge...
Semi-Structured Data Extraction and Schema Knowledge Mining
It is well known that the World Wide Web has become a huge information resource. Therefore, it is very important for us to utilize this kind of information effectively. This paper proposes a semi-structured data extraction method to get the useful information embedded in a group of relevant web pages, and store it with OEM (Object Exchange Model). Then, we adopt a data mining method to discover schema...
Journal title:
- CoRR
Volume abs/1504.04031 Issue
Pages -
Publication date 2015